Reconstruction-based Outlier Detection
Reconstruction-based outlier detection methods identify outliers by measuring how well a data point can be reconstructed from a compressed or transformed representation of the original data. The core idea is that normal points in the data set can be reconstructed with low error, whereas outliers have a high reconstruction error. Principal Component Analysis (PCA) is one such reconstruction-based outlier detection method.
For example, imagine a dataset with two highly correlated features, f1 and f2. If the data points are plotted in 2D space, most of them lie along a diagonal line.
The normal data points are (2,2), (3,4), (4,6), (5,8), and (6,10). Consider another data point, (11,3), which is an outlier. The normal points follow the linear relationship f2 = 2 * f1 - 2, but the outlier (11,3) does not fit this pattern: for f1 = 11 the relationship gives f2 = 20, not 3.
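The example can be checked numerically with a minimal sketch. It assumes that PCA is fit on the five normal points only and that a single principal component is kept. Both choices are illustrative assumptions, and the helper names fit_pca and reconstruction_error are hypothetical, not taken from any library. The sketch uses NumPy's SVD to find the principal direction, then measures how far each point lies from its reconstruction.

```python
import numpy as np

# Data from the example above.
normal = np.array([[2, 2], [3, 4], [4, 6], [5, 8], [6, 10]], dtype=float)
outlier = np.array([[11, 3]], dtype=float)


def fit_pca(X, n_components=1):
    """Return the mean and the top principal directions of X (hypothetical helper)."""
    mean = X.mean(axis=0)
    # Rows of vt are the principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]


def reconstruction_error(X, mean, components):
    """Euclidean distance between each point and its PCA reconstruction."""
    centered = X - mean
    projected = centered @ components.T   # compress to n_components dimensions
    reconstructed = projected @ components  # map back to the original space
    return np.linalg.norm(centered - reconstructed, axis=1)


# Assumption: the model is fit on the normal points only.
mean, components = fit_pca(normal, n_components=1)

normal_errors = reconstruction_error(normal, mean, components)
outlier_error = reconstruction_error(outlier, mean, components)

print("normal errors:", np.round(normal_errors, 6))
print("outlier error:", np.round(outlier_error, 6))
```

Because the five normal points lie exactly on the line f2 = 2 * f1 - 2, their reconstruction errors are zero up to floating-point rounding. The error for (11,3) is its distance from that line, 17 / sqrt(5), which is roughly 7.6. In practice the model is often fit on data that already contains some outliers, and a threshold on the error has to be chosen. Neither step is covered by this sketch.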
Step-by-Step